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Abstract 

Given an undirected graph G of n weighted edges, stored one edge per processor in a square mesh of 
n processors, we show how to determine the connected components and a minimal spanning forest in 
Q{y/n) time. More generally, we show how to solve these problems in 0(n^/'^) time when the mesh is 
a d-dimensional cube, where the implied constants depend upon d. 
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1 Introduction 

Note: these results were first announced by the author in 1984 ||3l, and were obtained contemporaneously 
by John Reif. We intended to publish a joint paper, but never got around to doing so. Since the results 
have been utilized many times, and over the years I’ve explained the algorithm to several people, I thought 
it useful to make this note available. I kept the title of the original announcement despite the fact that very 
few talk about algorithms for VLSI anymore. 

We give algorithms for determining a minimal spanning forest, and the connected components, of an 
undirected graph stored on a mesh-connected computer. Figure [T] a) shows a 2-dimensional mesh-connected 
computer. Each processor can directly communicate with its 4 neighbors in unit time, and messages to 
further away processors must be passed from neighbor to neighbor. A fine-grained model is used, where 
each processor has a fixed number of words of memory and does fundamental operations in unit time. This 
is a slight extension of cellular automata, where each automaton has only a fixed number of bits of memory. 

Meshes have long been used as a fundamental form of parallel computer, and were especially attractive 
for VLSI design because very little area was wasted on connections O. Meshes, though not as fine-grained, 
are once again important for chip design. Recent interest is motivated by the fact that the distance signals 
travel determines the time and energy needed. Energy consumption is becoming a dominant constraint 
of chip design, and chips with large numbers of simple processors connected as a mesh are becoming 
available ||5]| (see Figure [Ub)). Several proposed designs of processing chips for exascale computers assume 
there will be many RISC cores connected as a 2-d mesh HHH. 

The input for our problems is the edges of an undirected weighted graph G = {V, E) with n edges, 
stored one per processor on a ^/n x y/n mesh. Each edge {u,v) has an associated weight w{u,v) > 0, 
where for an unweighted graph w{u, n) = 1 is implied. Whenever edges are being moved their weight goes 
with them. We assume there is an ordering on the labels of the vertices. To simplify exposition, assume that 
each edge is represented twice, so that an edge between vertices u and v is stored as (u, v) and as {v, u). 
Further, for every vertex v there is a self-loop, i.e., an edge of the form (u, v). This guarantees that every 
vertex is represented. We will show: 
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Figure 1: Abstract and Implemented Meshes 


Theorem 1 Given an undirected weighted graph G = {V, E) with n edges, stored one edge per processor 
on an y/n x y/n mesh computer, in Q{y/n) time one can find a minimal spanning forest and label the 
connected components. 

The algorithm to find a minimal spanning forest (MSF) uses a recursive approach originally used by 
Borilkva and which has been rediscovered by many others (including Prim). For each vertex an edge is 
selected, forming a forest but not necessarily a spanning forest. The edges selected become part of the MSF, 
and the trees are supervertices used as the vertices in the next iteration. For example, for the graph in Figure|2] 
a), the edges selected are shown in b). This is known as coarsening. The edge between supervertices U and 
V is one having minimal weight among the edges connecting a vertex in U with one in V. Ties can be 
broken arbitrarily. The result is shown in c). If a supervertex is not connected to any other others then it 
is finished. Affer a coarsening sfep fhe number of unfinished superverfices is af mosf 1/2 fhe number of 
vertices since each such superverfex confains af leasf fwo vertices. 

Throughouf, all sorting is info a space-filling curve ordering, such as fhe Hilbert ordering illusfraled in 
Figure [T] a). Sorting in a y/n x y/n mesh can be done in Q{y/n) time HI. 

Minimal Spanning Forest Algorithm: 

1) Do coarsening 5 times, leaving at most n/32 remaining superverfices. 

2) In each quadrant of the mesh, recursively solve the MSF problem for the superverfices, using only 
the edges in the quadrant. The number of edges in a quadrant’s MSF is no more than the number of 
superverfices, so for all of the quadrants combined the number of edges is at most 4 • (re/32) = re/8. 

3) Move these edges to a submesh of size re/8 and recursively solve the MSF problem in this submesh. 
This uses the fact that a MSF of the union of the MSFs of the subgraphs is a MSF of the entire graph. 

Step 1) reduces the number of vertices, then step 2) reduces the number of edges. Without 2), even though 
the graph has fewer vertices at the end of 1), the number of edges may not have been reduced much, so step 
3) would not have been in a small submesh. 
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a) initial graph 


b) selected edges 
Figure 2: Coarsening 


c) supervertices 


Coarsening: 

a) For every vertex u select the edge (u, v) of minimal weight among all incident edges, where if there 
are ties select the one where v has minimal label. This is an edge in the MSF. If a vertex has no edges 
then it is removed from further consideration and already has its label. 

b) The selection rule insures that no cycles are created and hence the edges selected form a forest. Label 
the trees, creating the supervertices. 

c) For every pair of supervertices U, V, choose one of the edges of minimal weight among those with 
one endpoint in U and one in V. Ties can be broken arbitrarily. This is the edge between U and V in 
the coarsened graph. See Figure |2]c). 

While the trees created during coarsening are undirected, labeling them involves creating directed subtrees, 
with each edge pointing towards the root of its subtree. Furthermore, weights are ignored. 

Labeling Trees: 

i) Using only the tree edges selected in coarsening, for each vertex u determine the neighboring vertex v 
with the smallest label and create the directed edge (rt, v), where the edge is (u, u) if u’s label is less 
than all of its neighbors’. This directed edge represents u in the following steps. The edges selected 
from each tree form directed subtrees, as shown in Figure [3] 

ii) Sort the directed edges by the label of the vertex being pointed at. Because sorting is in space-filling 
curve order, vertices in any quadrant can only point to vertices in the same quadrant or a quadrant of 
smaller index. 

iii) For every vertex, find the label of the root of its subtree. This is done in a bottom-up fashion, using 
2x2 meshes, then 2^x22 meshes, etc. For vertex u let M(u, i) be the 2* x 2* mesh containing i, and 
let A(u, i) be its greatest ancestor (node closest to the root on the path from u to the root) represented 
in M(ti, i). Thus if x is A{u, i)’s parent then edge {A[u, i),x) is in M{u, i) while x’s representative 
is not in M{u,i). Initially 0) = v, where v is the neighbor selected in step i). To determine 
A{u,i + 1), note that if the parent of A{u, i) is in M{u,i + 1) then it must be in a quadrant of smaller 
index since its label is smaller than ^(rt, z)’s. Thus by 3 iterations of every quadrant of M{u,i + 1) 
searching in the quadrant preceding it, A{u,i + 1) can be determined. This is not a recursive call, 
merely a merging operation, thus this step can be completed in Q{^/n) total time. 
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Figure 3: Forming Directed Subtrees 


iv) In the preceding step we don’t actually keep track of i, but rather just keep updating a value A{u). At 
the end, for each vertex u for which A{u) = u and no neighbors point to u, if u has neighbors in the 
graph (such as vertex 5 in the example), choose an arbitrary neighbor v and set A{u) = A{v) If it has 
no neighbors then tt’s label is u and it is finished. The number of unfinished subfrees is af mosf 1/2 
fhe number of vertices since each unfinished subfree represenfs af leasf 2 vertices. 

v) The subfrees are now superverfices for fhis free labeling routine, and for any pair of superverfices 
eifher fhey are nof adjacenf in fhe original free or fhere is a unique edge in fhe original free fhaf 
connecfs fhem. Move fhe fhe edges connecting superverfices fo a submesh and recursively call fhe 
routine sfarfing af sfep ii). 

To analyze fhe time lef Tuspin), Tcoa.Tse{n), Ti^aheiin) denofe fhe lime for finding fhe minimal spanning 
foresl, coarsening, and labeling fhe frees, respeclively, on an ^/n x ^/n mesh. 

7Label(n) = TLabelW2) + 0(Vra) 

= 0(V^) 

Tc oarse (n) = TlabelH + 0(v^) 

= 0(Vre) 

Tuspin) = 5rcoarse(ra) + Tmsf(’^/4) + Tmsf(?^/8) + &{^/n) 

= 0(Vra) 

For componenf labeling, af fhe end of fhe MSF algorilhm one can do free labeling on fhe edges selecled 
lo defermine fhe componenf labels. 

Finally, all of fhe above can be extended fo d-dimensional meshes in a slraighlforward manner, fhe only 
difference being lhal 2d + 1 verfex reductions are needed in step 1). This gives 

Theorem 2 Given an undirected weighted graph G = {V, E) with n edges, stored one edge per processor 
on a cubical d-dimensional mesh computer, in time one can find a minimal spanning forest and 

label the connected components, where the implied constants depend upon d. □ 


4 




References 

[1] DOE, “Scientific grand challenges: architectures and technology for extreme scale computing”, Report 
of DOE Workshop, Dec. 2009 

[2] DOE, “Exascale programming challenges”. Report of DOE Workshop, July 2011. 

[3] Stout, Q.E., “Optimal component labeling algorithms for mesh-connected computers and VESl”, Ab¬ 
stracts AMS 5, 1984, 148. 

[4] Thompson, C.D. and Kung, H.T., “Sorting on a mesh-connected parallel computer”. Comm. ACM 20, 
1977, 263-271. 

[5] Tilera Corporation, “T1EE-Gx8072 Processor Product Brief”, 
www.tilera.com/sites/default/files/productbriefs/TlEE-Gx8072_PB041-04_WEB.pdf 

[6] Ullman, J.D., Computational Aspects of VLSI, 1984, Computer Science Press, Maryland. 


5 


